

#### **RRAM-Based Reconfigurable Computing**

<u>Mohammed Zidan</u> and Wei D. Lu

Sept 12, Tysons Corner, VA

# **Ageing Technology**





- K. Rupp, "<u>40 Years of Microprocessor Trend Data</u>," blog.

- K. Bresniker et al., "Adapting to Thrive in a New Economy of Memory Abundance," Computer 2015.

#### **Innovation Cycle**







(Moore's Law)

### **Modern Applications**

















#### » Monolithic Integration

#### **FPCA Processor**





### Field Programmable Crossbar



#### <u>» Reconfigurable Cores</u>

#### Workload "A"



### Field Programmable Crossbar



#### <u>» Reconfigurable Cores</u>

#### Workload "B"



### Field Programmable Crossbar



#### <u>» Reconfigurable Cores</u>

#### Workload "C"









#### » Binary vs. Analog Devices

| RRAM Type           | Analog  | Binary |
|---------------------|---------|--------|
| Device Levels       | <100    | 2      |
| <b>ON/OFF</b> Ratio | Average | High   |
| Endurance           | Average | High   |
| Programming         | Slow    | Fast   |



#### » Binary vs. Analog Devices

| RRAM Type              | Analog       | Binary       |
|------------------------|--------------|--------------|
| Device Levels          | <100         | 2            |
| <b>ON/OFF</b> Ratio    | Average      | High         |
| Endurance              | Average      | High         |
| Programming            | Slow         | Fast         |
| Data Storage           | $\checkmark$ | $\checkmark$ |
| Digital Computing      |              | $\checkmark$ |
| Neural Networks        | $\checkmark$ |              |
| Internal Data Movement |              |              |



#### » Binary vs. Analog Devices

| RRAM Type              | Analog       | Binary       |
|------------------------|--------------|--------------|
| Device Levels          | <100         | 2            |
| <b>ON/OFF</b> Ratio    | Average      | High         |
| Endurance              | Average      | High         |
| Programming            | Slow         | Fast         |
| Data Storage           | $\checkmark$ | $\checkmark$ |
| Digital Computing      |              | TR & PDE     |
| Neural Networks        | $\checkmark$ | BCNN         |
| Internal Data Movement |              | In-Situ      |





# A. Binary Coded Neural Networks

#### » Binary Synaptic weights

- Our approach uses binary RRAM devices to implement a semi-analog (hybrid) neural network.
- Each classical analog device is replaced with "n" binary devices in the new network.
- The number of synaptic-weight bits can be dynamically configured when needed.
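The bullets above can be illustrated with a minimal Python sketch. It assumes weights normalized to [0, 1], plain unsigned binary coding with the most significant bit first, and one crossbar column per bit; none of these details are stated on the slide, and the function names `encode_weight`, `decode_weight`, and `neuron_output` are hypothetical.

```python
def encode_weight(w, n_bits):
    """Quantize w in [0, 1] and return its n_bits binary devices, MSB first."""
    levels = (1 << n_bits) - 1
    code = round(max(0.0, min(1.0, w)) * levels)
    return [(code >> k) & 1 for k in range(n_bits - 1, -1, -1)]


def decode_weight(bits):
    """Recover the quantized weight value from its binary devices."""
    code = 0
    for b in bits:
        code = (code << 1) | b
    return code / ((1 << len(bits)) - 1)


def neuron_output(inputs, weights, n_bits):
    """Weighted sum for one neuron built from n_bits binary columns.

    Column k accumulates sum_i x_i * bit_k(w_i), as a crossbar column would;
    the column sums are then combined with binary place values.
    """
    bit_rows = [encode_weight(w, n_bits) for w in weights]
    column_sums = [
        sum(x * row[k] for x, row in zip(inputs, bit_rows))
        for k in range(n_bits)
    ]
    levels = (1 << n_bits) - 1
    weighted = sum(s * (1 << (n_bits - 1 - k)) for k, s in enumerate(column_sums))
    return weighted / levels


# Assumed example: two inputs, 4 binary devices per synaptic weight.
example_out = neuron_output([1.0, 0.5], [1.0, 0.2], n_bits=4)
```

Changing `n_bits` changes the number of columns assigned to the neuron, which is one way to read the "dynamically configured" bullet.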



#### **Hybrid Neural Network**



#### » Network Structure



### **Hybrid Neural Network**





#### **Hybrid Neural Network**





### **Network Training**

#### » Training Precision Effect

- A low training rate is required to train the network receptive fields (dictionaries) properly.
- This translates into a larger number of bits per weight, so that small " $\Delta w$ " updates can be represented.
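A small Python sketch of why small updates need more bits. It assumes weights in [0, 1] stored on a uniform grid with round-to-nearest after every update; the actual update rule used in the work is not given on the slide, and all names here are hypothetical.

```python
def quantize(w, n_bits):
    """Round w in [0, 1] to the nearest level of an n_bits uniform grid."""
    levels = (1 << n_bits) - 1
    return round(max(0.0, min(1.0, w)) * levels) / levels


def run_updates(w0, delta_w, n_bits, n_steps):
    """Apply the same small update n_steps times, re-quantizing each time.

    Returns the final stored weight and how many updates actually changed it.
    """
    w = quantize(w0, n_bits)
    changed = 0
    for _ in range(n_steps):
        new_w = quantize(w + delta_w, n_bits)
        if new_w != w:
            changed += 1
        w = new_w
    return w, changed


# Assumed example: delta_w = 0.01 is below half a level at 4 bits
# (level spacing 1/15) but well above one level at 8 bits (spacing 1/255).
w4, changed4 = run_updates(0.0, 0.01, n_bits=4, n_steps=50)
w8, changed8 = run_updates(0.0, 0.01, n_bits=8, n_steps=50)
```

In this sketch the 4-bit weight never moves, because every update rounds back to the same level, while the 8-bit weight changes on every step.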





### **Network Training**

#### » Training Precision Effect

- Typically, training is infrequent or is performed offline.
- Hence, after training, the number of bits per synaptic weight can be significantly reduced by assigning fewer columns per neuron.
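One way to picture the reduction is dropping the least significant columns of each trained weight. The sketch below assumes a fixed-point fraction with the most significant bit first; the slide does not specify the encoding, and the helper names are hypothetical.

```python
def to_bits(w, n_bits):
    """Fixed-point encode w in [0, 1) as n_bits binary devices, MSB first."""
    code = min(int(w * (1 << n_bits)), (1 << n_bits) - 1)
    return [(code >> k) & 1 for k in range(n_bits - 1, -1, -1)]


def from_bits(bits):
    """Value of an MSB-first fixed-point fraction: sum of b_k * 2^-(k+1)."""
    return sum(b / (1 << (k + 1)) for k, b in enumerate(bits))


def keep_msb_columns(bits, keep):
    """Model assigning fewer columns per neuron: keep only the top `keep` bits."""
    return bits[:keep]


# Assumed example: a weight trained at 8 bits, then used with 4 columns.
trained_bits = to_bits(0.7, 8)
reduced_bits = keep_msb_columns(trained_bits, 4)
full_value = from_bits(trained_bits)
reduced_value = from_bits(reduced_bits)
truncation_error = full_value - reduced_value
```

Under this encoding the error from keeping `keep` columns is always below `2 ** -keep`, which gives a simple handle on how many columns can be released after training.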



# **Analog Image Compression**


#### » Sparse Coding

- In analog image compression (sparse coding), each patch of a picture is represented as a weighted combination of the network dictionary elements.
- We adopted the locally competitive algorithm (LCA) to perform the analog image compression.
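For reference, a minimal software sketch of LCA sparse coding in plain Python. It uses the standard textbook form (membrane potentials driven by the input projection, lateral inhibition through the dictionary Gram matrix, soft thresholding) with forward-Euler integration; the dictionary, threshold, and step size below are illustrative assumptions and not values from the talk, and nothing here models the RRAM hardware.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def soft_threshold(u, lam):
    """Shrink u toward zero by lam; values within [-lam, lam] become 0."""
    if u > lam:
        return u - lam
    if u < -lam:
        return u + lam
    return 0.0


def lca(x, atoms, lam=0.1, step=0.1, n_iter=500):
    """Sparse-code x over unit-norm dictionary atoms with LCA dynamics.

    du_i/dt = b_i - u_i - sum_{j != i} G_ij * a_j,  a_j = soft_threshold(u_j)
    where b = D^T x and G = D^T D. Integrated with forward Euler.
    """
    m = len(atoms)
    b = [dot(d, x) for d in atoms]
    gram = [[dot(di, dj) for dj in atoms] for di in atoms]
    u = [0.0] * m
    for _ in range(n_iter):
        a = [soft_threshold(ui, lam) for ui in u]
        inhibition = [
            sum(gram[i][j] * a[j] for j in range(m) if j != i) for i in range(m)
        ]
        u = [ui + step * (bi - ui - inh) for ui, bi, inh in zip(u, b, inhibition)]
    return [soft_threshold(ui, lam) for ui in u]


def reconstruct(coeffs, atoms):
    """Weighted combination of dictionary atoms."""
    n = len(atoms[0])
    return [sum(c * d[k] for c, d in zip(coeffs, atoms)) for k in range(n)]


# Assumed toy dictionary: three unit-norm atoms in 2-D (overcomplete).
ATOMS = [[1.0, 0.0], [0.0, 1.0], [1 / math.sqrt(2), 1 / math.sqrt(2)]]
patch = [2.0, 0.0]
codes = lca(patch, ATOMS)
approx = reconstruct(codes, ATOMS)
```

In this toy case only the first atom stays active, so the patch is stored as a single coefficient; the small shrinkage in the reconstruction is the usual cost of the sparsity threshold.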



### **Analog Image Compression**



#### <u>» Results</u>

"Original"



#### "Reconstructed"









#### » Crossbar Tree Reduction





#### » Crossbar Tree Reduction



| R | R | R | R | R | R | R | R |
|---|---|---|---|---|---|---|---|
| R | R | R | R | R | R | R | R |
| R | R | R | R | R | R | R | R |
| R | R | R | S | R | R | R | R |
| R | R | R | R | R | R | R | R |
| R | R | R | R | R | R | R | R |
| R | R | R | R | R | R | R | R |
| R | R | R | R | R | R | R | R |





#### » Crossbar Tree Reduction

44,800 Simulation points





#### » Crossbar Tree Reduction

$$\begin{bmatrix} A & B & C & D \end{bmatrix} \cdot \begin{bmatrix} E & I & M \\ F & J & N \\ G & K & O \\ H & L & P \end{bmatrix} = \begin{bmatrix} X & Y & Z \end{bmatrix}$$
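The product above can be sketched in Python as one column sum per output, with the per-row partial products added pairwise in a tree. How the reduction is mapped onto the crossbar hardware is not described in the slide text, so this is only a software illustration; `tree_reduce` and `vector_matrix_product` are hypothetical names and the numbers are made up.

```python
def tree_reduce(values):
    """Sum values by repeatedly adding adjacent pairs (a binary adder tree)."""
    vals = list(values)
    if not vals:
        return 0
    while len(vals) > 1:
        nxt = [vals[i] + vals[i + 1] for i in range(0, len(vals) - 1, 2)]
        if len(vals) % 2:
            nxt.append(vals[-1])  # odd element passes through to the next level
        vals = nxt
    return vals[0]


def vector_matrix_product(row, matrix):
    """Row vector times matrix: output j reduces row[i] * matrix[i][j] over i."""
    n_cols = len(matrix[0])
    return [
        tree_reduce(row[i] * matrix[i][j] for i in range(len(row)))
        for j in range(n_cols)
    ]


# Assumed numeric stand-ins for [A B C D] and the 4x3 matrix.
row_vector = [1, 2, 3, 4]
matrix = [
    [1, 0, 2],
    [0, 1, 2],
    [1, 1, 0],
    [2, 0, 1],
]
xyz = vector_matrix_product(row_vector, matrix)  # [X, Y, Z]
```

Here `xyz[0]` corresponds to X = A·E + B·F + C·G + D·H, and likewise for Y and Z with the second and third columns.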







#### » PDE Solver

#### **Electromagnetics**



https://www.horiba-mira.com/

#### **Fluid Flow**



http://www.metrosystems-des.com



#### **Weather Forecasting**



#### **Heat Transfer**



http://www.theseus-fe.com

#### **Many Others**





#### » PDE Solver





#### » PDE Solver







### Memory / Data Storage



#### <u>» Reliable Storage</u>



### **In-Situ Data Migration**



#### » In-Situ Data Migration





#### Summary







| New Computing Devices    | $\checkmark$ |
|--------------------------|--------------|
| No Memory Bottleneck     | $\checkmark$ |
| Classical Process        | $\checkmark$ |
| <b>Cognitive Process</b> | $\checkmark$ |
| Low Power Consumption    | $\checkmark$ |
| Scalable                 | $\checkmark$ |



